- Home
- Search Results
- Page 1 of 1
Search for: All records
-
Total Resources1
- Resource Type
-
0000100000000000
- More
- Availability
-
10
- Author / Contributor
- Filter by Author / Creator
-
-
Kang, E H (1)
-
Kumar, P R (1)
-
#Tyler Phillips, Kenneth E. (0)
-
#Willis, Ciara (0)
-
& Abreu-Ramos, E. D. (0)
-
& Abramson, C. I. (0)
-
& Abreu-Ramos, E. D. (0)
-
& Adams, S.G. (0)
-
& Ahmed, K. (0)
-
& Ahmed, Khadija. (0)
-
& Aina, D.K. Jr. (0)
-
& Akcil-Okan, O. (0)
-
& Akuom, D. (0)
-
& Aleven, V. (0)
-
& Andrews-Larson, C. (0)
-
& Archibald, J. (0)
-
& Arnett, N. (0)
-
& Arya, G. (0)
-
& Attari, S. Z. (0)
-
& Ayala, O. (0)
-
- Filter by Editor
-
-
& Spizer, S. M. (0)
-
& . Spizer, S. (0)
-
& Ahn, J. (0)
-
& Bateiha, S. (0)
-
& Bosch, N. (0)
-
& Brennan K. (0)
-
& Brennan, K. (0)
-
& Chen, B. (0)
-
& Chen, Bodong (0)
-
& Drown, S. (0)
-
& Ferretti, F. (0)
-
& Higgins, A. (0)
-
& J. Peters (0)
-
& Kali, Y. (0)
-
& Ruiz-Arias, P.M. (0)
-
& S. Spitzer (0)
-
& Sahin. I. (0)
-
& Spitzer, S. (0)
-
& Spitzer, S.M. (0)
-
(submitted - in Review for IEEE ICASSP-2024) (0)
-
-
Have feedback or suggestions for a way to improve these results?
!
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Interactive decision making, encompassing bandits, contextual bandits, and reinforcement learning, has recently been of interest to theoretical studies of experimentation design and recommender system algorithm research. Recently, it has been shown that the wellknown Graves-Lai constant being zero is a necessary and sufficient condition for achieving bounded (or constant) regret in interactive decision making. As this condition may be a strong requirement for many applications, the practical usefulness of pursuing bounded regret has been questioned. In this paper, we show that the condition of the Graves-Lai constant being zero is also necessary to achieve delay model robustness when reward delays are unknown (i.e., when feedbacks are anonymous). Here, model robustness is measured in terms of ✏-robustness, one of the most widely used and one of the least adversarial robustness concepts in the robust statistics literature. In particular, we show that ✏-robustness cannot be achieved for a consistent (i.e., uniformly sub-polynomial regret) algorithm however small the nonzero ✏ value is when the Grave-Lai constant is not zero. While this is a strongly negative result, we also provide a positive result for linear rewards models (Linear contextual bandits, Reinforcement learning with linear MDP) that the Grave-Lai constant being zero is also sufficient for achieving bounded regret without any knowledge of delay models, i.e., the best of both the efficiency world and the delay robustness world.more » « less
An official website of the United States government

Full Text Available